This is my work-in-progress attempt to use the nhdplusTools package to select WQP sites within WSR watersheds and to identify their distance away from the designated WSR reach. Currently the goal is not to identify the exact distance away from the WSR, but rather a general distance (0-5 km upstream, 5-10 m upstream, etc.), mostly becuase I have not been able to figure out a way to calculate the exact distance under this sort of methodology. Other ways that I have explored to tackle this assignment have been using the package riverdist (ran into problems for large datasets with rivers in different watersheds), or to raster flow acc to calculate distance from a certain point (still thinking there might be a way to do it this way). I am using a small subset of the WSRs because it takes forever to do with the full system and often crashes my computer. Major problems that I am having: -How to iterate within pre-made functions from packages -How to use functions -Running large datasets

WQP SITES UPSTREAM OF WSRs

Reading in WSR flowlines, watersheds, and associated upstream starting points:

wsr_flowlines <- st_read('data/wsr_flowlines.shp') %>%
  st_transform(4326)
## Reading layer `wsr_flowlines' from data source `C:\Users\kwilli\Documents\WR674\ClassDirectories\wsr_spatial\data\wsr_flowlines.shp' using driver `ESRI Shapefile'
## Simple feature collection with 10 features and 19 fields
## geometry type:  MULTILINESTRING
## dimension:      XY
## bbox:           xmin: -73.15059 ymin: 41.32011 xmax: -70.90709 ymax: 42.74056
## epsg (SRID):    4269
## proj4string:    +proj=longlat +datum=NAD83 +no_defs
wsr_watersheds <- st_read('data/wsr_watersheds.shp')%>%
  st_transform(4326) 
## Reading layer `wsr_watersheds' from data source `C:\Users\kwilli\Documents\WR674\ClassDirectories\wsr_spatial\data\wsr_watersheds.shp' using driver `ESRI Shapefile'
## Simple feature collection with 8 features and 4 fields
## geometry type:  MULTIPOLYGON
## dimension:      XY
## bbox:           xmin: 1850760 ymin: 2282985 xmax: 2057221 ymax: 2452220
## epsg (SRID):    NA
## proj4string:    +proj=aea +lat_1=29.5 +lat_2=45.5 +lat_0=23 +lon_0=-96 +x_0=0 +y_0=0 +datum=NAD83 +units=m +no_defs
wsr_start_point <- st_read('data/nhd_snapped_upstream_points.shp') %>%
  st_transform(4326)
## Reading layer `nhd_snapped_upstream_points' from data source `C:\Users\kwilli\Documents\WR674\ClassDirectories\wsr_spatial\data\nhd_snapped_upstream_points.shp' using driver `ESRI Shapefile'
## Simple feature collection with 36 features and 1 field
## geometry type:  POINT
## dimension:      XY
## bbox:           xmin: -73.14829 ymin: 41.38191 xmax: -70.94317 ymax: 42.74023
## epsg (SRID):    4269
## proj4string:    +proj=longlat +datum=NAD83 +no_defs
mapview(list(wsr_start_point,wsr_flowlines,wsr_watersheds))

Reading in WQP site data, as a table and as an sf object:

wq_table <- read_feather('data/unique_site_inventory.feather') %>%
  filter(source == 'WQP') #removing LAGOS

wq_spatial <- wq_table %>%
  st_as_sf(.,coords=c('long','lat'), 
           remove = FALSE,
           crs=4326) 

Selecting only WQP sites that are within WSR watersheds:

wq_within <- wq_spatial[wsr_watersheds,] %>%
  st_zm() #for future functions
## although coordinates are longitude/latitude, st_intersects assumes that they are planar
## although coordinates are longitude/latitude, st_intersects assumes that they are planar

Reading in NHDPlusV2.1 Flowlines (pre-clipped ro WSR Watersheds), then converting into point features:

flowlines <- read_sf('D:/WSR_Project/data/clipped_nhdplusv2.shp') %>%
  st_zm() %>%
  st_transform(4326)

flowlines_points <- flowlines %>%
  st_coordinates() %>%
  as_tibble() %>%
  st_as_sf(.,coords=c('X','Y'), 
           remove = FALSE,
           crs=4326) %>%
    st_join(., flowlines, join=st_nearest_feature, left=TRUE)
## although coordinates are longitude/latitude, st_nearest_feature assumes that they are planar
#mapview(list(flowlines,flowlines_points))

I pre-snapped WSR upstream starting points to the NHD, so identifying which NHD feature that the WSR starts on is as simple as using the “st_join” function with the flowlines and WSR xy starting points:

wsr_start_nhd_join <- st_join(wsr_start_point, flowlines, join = st_nearest_feature, left=TRUE) %>%
  select(., 'WSR', 'REACHCODE', 'COMID', 'LENGTHKM')
## although coordinates are longitude/latitude, st_nearest_feature assumes that they are planar

Identifying the closest NHD flowline feature to each WQP site is a bit messier:

nearest_nhd_to_wq <- wq_within %>%
  st_join(., flowlines_points, join=st_nearest_feature, left=TRUE) %>%
  group_by(SiteID)
## although coordinates are longitude/latitude, st_nearest_feature assumes that they are planar

Sometimes, these WQP features are quite far away from any NHD flowline feature, therefore I must identify the distance between the WQP site and its nearest NHD flowline feature…

relate_nhd <- wq_within %>%
  st_join(., flowlines_points, join=st_nearest_feature, left=TRUE) %>%
  st_set_geometry(NULL)
## although coordinates are longitude/latitude, st_nearest_feature assumes that they are planar
spatial_related_nhd <- relate_nhd %>%
  st_as_sf(.,coords=c('X','Y'), 
           remove = FALSE,
           crs=4326) %>%
  group_by(SiteID)

mapview(list(nearest_nhd_to_wq,spatial_related_nhd))

I am stuck on how to calculate the distance between each SiteID and its associated NHDFlowline feature. I found a resolution for this problem on https://stackoverflow.com/questions/54887209/compute-pointwise-distance-by-group-in-r-with-sf-dplyr, but when I try it with my data it says - “st_crs(x) == st_crs(y) is not TRUE”. I’m not really sure how to tackle this issue.

# nearest_nhd_to_wq %>%
#   mutate(dst=map2_dbl(SiteID, geometry,
#                       ~ st_distance(.y, spatial_related_nhd %>% filter(SiteID == .x) %>% pull(geometry))
#   ))

Once I get this problem figured out and I’ve identified the distance between the WQP sites and their closest NHD flowline feature, I can start to identify WQP sites that are some distance away from the WSR starting point… I will need to do this iteratively over all of the WSR upstream starting points. I’m still struggling on this as well. What I have so far, for upstream distance of 5 km:

#for SINGLE point, identifying flowlines upstream can be done with (still struggling with doing this iteratively):
upstream_test <- get_UT(flowlines, 6097391, distance = 5) %>%
  as_tibble() %>%
  rename(COMID = value) %>%
  inner_join(.,nearest_nhd_to_wq, by='COMID') #identifies WQP sites that are associated with these flowlines
## Warning: Calling `as_tibble()` on a vector is discouraged, because the behavior is likely to change in the future. Use `tibble::enframe(name = NULL)` instead.
## This warning is displayed once per session.

BUT this way does not take into account the added distance that WQP sites may actually be. I need to add the additional distance between the flowline and the WQP site to its total distance from the WSR upstream starting point. I am hoping to develop some sort of filter where if the LENGTHKM + added_distance <= 5, then that WQP site can be identified.

Because I am hoping to identify WQP sites that fall within certain distances upstream (e.g. 0-5 km, 5-10 km, 10-25 km, etc…), would nesting these lists of WQP sites upstream at 0-5 km, 5-10 km, etc. within the WSR starting point dataframe be the best way of tackling this?

Here is some incomplete, WIP code I’m trying to use to do this:

# upstream <- function(x, y) {
#   
#   for(i in 1:x$COMID
#       upstream_list <- upstream_test <- get_UT(y, x$COMID[i], distance = 5) %>%
#         as.tibble()%>%
#         rename(COMID=value) %>%
#         inner_join(.,wsr_start_point, by='COMID') %>%
#         filter(LENGTHKM + "added_distance_column" <=  5) %>%  #filter out wqp sites that are beyond 5 km away including added distance from snapped flowine. 
#         nest(-upstream_list[i], )
#       return(upstream_list)
# }

#yikes

Even if I get this stuff sorted out, WHAT ABOUT TRIBUTARIES WITHIN THE WSR BOUNDARY? The way I am doing it does not trace upstream into the tributaries that drain WITHIN the WSR boundary… manually creating point features along these tributaries and adding them to the wsr upstream points would work but I am wondering if there is a better way of doing this…

WQP SITES WITHIN WSR BOUNDARY

To identify WQP sites ALONG a WSR, I’m thinking that I can create a quarter mile buffer around the WSR flowline features, then select the WQP sites within that buffer? I’m choosing 1/4 mile because it is the limit to designated WSR past the bed and banks of a WSR.

wsr_boundary <- st_buffer(wsr_flowlines, 1, endCapStyle='FLAT') %>%
  st_transform(4326)
## Warning in st_buffer.sfc(st_geometry(x), dist, nQuadSegs, endCapStyle =
## endCapStyle, : st_buffer does not correctly buffer longitude/latitude data
## dist is assumed to be in decimal degrees (arc_degrees).
mapview(list(wsr_flowlines, wsr_boundary))

What the heck is going on with this buffer? But once I get this figured out, I should just be able to do a spatial filter:

wq_within_wsr_boundary <- wq_within[wsr_boundary,]
## although coordinates are longitude/latitude, st_intersects assumes that they are planar
## although coordinates are longitude/latitude, st_intersects assumes that they are planar
WQP SITES DOWNSTREAM OF WSRs

Downstream analysis will require something different, since I am only interested in WQ data that is ABSOLUTELY downstream of WSRs…For instance, a WQP site that is within some buffer of the downstream reaches could still be from a tributary to that downstream flowline (so not technically connected to the WSR).